Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis
Abstract
Complete and accurate genome assembly and annotation are a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear chromosomes and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. Compared with using transcriptomics evidence alone, this increased the number of annotated protein-coding genes by 14% (from 3612 to 4113). Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence, and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here are of a quality achieved so far for only a few eukaryotic organisms, and constitute an important reference for future host-microbe interaction studies.
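The percentage gains quoted in the abstract can be checked from the three gene counts it reports. The sketch below is only an arithmetic illustration: the function and variable names are hypothetical, and the cumulative figure is derived here from the reported counts rather than stated in the source.

```python
def percent_increase(before: int, after: int) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Gene counts as reported in the abstract.
transcriptomics_only = 3612   # annotation from RNA-seq evidence alone
with_proteomics = 4113        # after adding proteomics evidence
after_curation = 4493         # after manual curation

# Gain from adding proteomics (the abstract rounds this to 14%).
proteomics_gain = percent_increase(transcriptomics_only, with_proteomics)

# Further gain from manual curation (the abstract rounds this to 9%).
curation_gain = percent_increase(with_proteomics, after_curation)

# Cumulative gain over the transcriptomics-only baseline
# (derived here, not quoted in the abstract).
total_gain = percent_increase(transcriptomics_only, after_curation)

print(f"proteomics: {proteomics_gain:.1f}%")
print(f"curation:   {curation_gain:.1f}%")
print(f"cumulative: {total_gain:.1f}%")
```

Note that the 9% figure is relative to the 4113 genes obtained after adding proteomics, not to the original 3612, so the two quoted percentages do not simply add up to the cumulative gain.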
Similar resources
Dual use of peptide mass spectra: Protein atlas and genome annotation
One of the objectives of genome science is the discovery and accurate annotation of all protein-coding genes. Proteogenomics has emerged as a methodology that provides orthogonal information to traditional forms of evidence used for genome annotation. By this method, peptides that are identified via tandem mass spectrometry are used to refine protein-coding gene models. Namely, these peptides a...
Integrating transcriptomic and proteomic data for accurate assembly and annotation of genomes.
Complementing genome sequence with deep transcriptome and proteome data could enable more accurate assembly and annotation of newly sequenced genomes. Here, we provide a proof-of-concept of an integrated approach for analysis of the genome and proteome of Anopheles stephensi, which is one of the most important vectors of the malaria parasite. To achieve broad coverage of genes, we carried out t...
Discovery of rare protein-coding genes in model methylotroph Methylobacterium extorquens AM1.
Proteogenomics involves the use of MS to refine annotation of protein-coding genes and discover genes in a genome. We carried out comprehensive proteogenomic analysis of Methylobacterium extorquens AM1 (ME-AM1) from publicly available proteomics data, with the aim of improving annotation for methylotrophs, organisms capable of surviving on reduced carbon compounds such as methanol. Besides identi...
Comprehensive Annotation of the Parastagonospora nodorum Reference Genome Using Next-Generation Genomics, Transcriptomics and Proteogenomics
Parastagonospora nodorum, the causal agent of Septoria nodorum blotch (SNB), is an economically important pathogen of wheat (Triticum spp.), and a model for the study of necrotrophic pathology and genome evolution. The reference P. nodorum strain SN15 was the first Dothideomycete with a published genome sequence, and has been used as the basis for comparison within and between species. Here we ...
Genomic Insights into the Atopic Eczema-Associated Skin Commensal Yeast Malassezia sympodialis
Malassezia commensal yeasts are associated with a number of skin disorders, such as atopic eczema/dermatitis and dandruff, and they can also cause systemic infections. Here we describe the 7.67-Mbp genome of Malassezia sympodialis, a species associated with atopic eczema, and contrast its genome repertoire with that of Malassezia globosa, associated with dandruff, as well as those of...
Journal title:
Volume 45, Issue
Pages -
Publication date: 2017